Add datetime aggregation fixes#144
Conversation
This comment was marked as spam.
This comment was marked as spam.
| case TIME: | ||
| return MILLIS.between(LocalTime.MIN, expr.timeValue()); |
There was a problem hiding this comment.
Workaround for sesstion context issue
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Show resolved
Hide resolved
| .shouldMatch(new ExprTimestampValue("1984-03-17 22:16:42") | ||
| .timestampValue().toEpochMilli()); | ||
| } | ||
|
|
There was a problem hiding this comment.
should we add the all-nulls case here too?
There was a problem hiding this comment.
I'm planning to add ITs that cover too.
There was a problem hiding this comment.
Existing UT framework is very limited and can't properly cover such case. There is a test for that and it didn't detect such case:
| case TIME: | ||
| state.total += MILLIS.between(LocalTime.MIN, value.timeValue()); | ||
| break; | ||
| case DOUBLE: |
There was a problem hiding this comment.
we cannot compute AVG for SHORT, INTEGER, FLOAT, etc...?
There was a problem hiding this comment.
These types are casted to DOUBLE automatically
There was a problem hiding this comment.
Should the automatic casting of types to double be tested in AvgAggregatorTest? I think only INTEGER and DOUBLE are only in there for number values.
core/src/test/java/org/opensearch/sql/expression/aggregation/AvgAggregatorTest.java
Outdated
Show resolved
Hide resolved
| public static long extractEpochMilliFromAnyDateTimeType(ExprValue value) { | ||
| switch ((ExprCoreType)value.type()) { | ||
| case DATE: | ||
| return MILLIS.between(LocalDateTime.of(1970, 1, 1, 0, 0), value.datetimeValue()); |
There was a problem hiding this comment.
I may be wrong, but it seems like we could use the Instant.toEpochMilli for this call too. Something like:
value.datetimeValue().toInstant().toEpochMilli();
There was a problem hiding this comment.
Nice catch!
Initially case DATE: fallen back with TIMESTAMP and DATETIME, conversion thru timestamp (Instant) returned wrong result, because timestamp changes TZ to UTC.
DATE -> MILLIS -> DATE -> MILLIS check failed.
There was a problem hiding this comment.
Thank you. While thinking on your comment I found a mistake. Fixed in 478141e.
Nice catch! It is fixed in #145 (not merged yet on upstream - opensearch-project#1000). |
I see. Thanks. LGTM! |
I think we should add integration tests for these cases once the PR upstream is merged. |
db98434 to
435e940
Compare
* Add aggregator fixes and some useless unit tests. * Add `avg` aggregation on datetime types. * Rework in-memory `AVG`. Fix parsing value returned from the OpenSearch node. Signed-off-by: Yury-Fridlyand <yuryf@bitquilltech.com>
435e940 to
31a72d4
Compare
Done! Thank you for the idea. |


Signed-off-by: Yury-Fridlyand yuryf@bitquilltech.com
Description
Make listed aggregations work with datetime types.
Fixes
minmaxavgNot included
var_sampvar_popstddev_sampstddev_popsum[1]Implementation details:
Presicion
Aggregation done with milliseconds precision - this follows OpenSearch approach. See code snippets: one, two.
We convert datetimes to millis and back on our own (see below), so we can scale up to nanoseconds. Such fix requires additional changes - few tests fail.
ExpressionAggregationScript::executeA callback function, called by OpenSearch node to extract a value during processing aggregation script.
This added for backward compatibility:
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 53 to 55 in 272bf67
This - to extract value. Fortunately,
toEpochMilli()returns negative values for pre-Epoch timestamps, so we are able to use it for group/compare any values.opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 56 to 67 in 272bf67
OpenSearch accepts dates in
Jodalib types and converts to milliseconds since Epoch. We have java datetime types, to I do conversion there.https://github.com/opensearch-project/OpenSearch/blob/140e8d3e6c91519edc47be07b4cd053fdfac1769/server/src/main/java/org/opensearch/search/aggregations/support/values/ScriptDoubleValues.java#L123
OpenSearchExprValueFactoryparseTimestampandconstructTimestampOpenSearch returns aggregated values as strings, even milliseconds since Epoch. I have to extract it, so I combined these functions together.
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Lines 178 to 198 in 975317c
ExprValueUtilsThese two methods used by
avgin-memory aggregation for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 190 in 272bf67
opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 213 in 272bf67
AvgAggregatorAvgStatewas renamed toDoubleAvgStateDateTimeAvgStatedoes the same logic, but for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AvgAggregator.java
Lines 105 to 117 in 272bf67
AvgStateAggregatorFunctionNew signature were added.
opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AggregatorFunction.java
Lines 69 to 76 in 272bf67
Actually, plugin was able to do push down aggregation on datetime types, but this weren't accepted.
[1] It works on MySQL, but I don't see any reason to implement this for datetime types.
Test queries
In-memory aggregation
opensearch-project-sql/integ-test/src/test/java/org/opensearch/sql/sql/AggregationIT.java
Lines 196 to 199 in 17e1ae9
(cast required to ensure that you're working with
TIME).To try other types, use
Push Down aggregation (
metric)Push Down aggregation (
composite_buckets)Issues Resolved
opensearch-project#645
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.